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Abstract — Given the growing, and often critical, use of location 
information in emerging mobile applications, the development of 
an optimal Location Verification System (LVS) is of increasing 
importance. This is especially so for network-based Intelligent 
Transport Systems, where verification of location information 
is paramount. In this work, for the first time, a rigorous 
information-theoretic framework for a general LVS is developed 
in which the LVS's decision rule is optimized in terms of the 
mutual information between its input and output data. We are 
specifically interested in the general attack scenario in which a 
non-colluding malicious user outside the perimeter of a network 
region (e.g. a highway) optimally boosts his transmit power 
in an attempt to fool the LVS that he is inside the network 
region. Developing specific threat models for this attack scenario, 
we investigate in detail, both analytically and numerically, the 
performance of the LVS in terms of its input/output mutual 
information. Our work shows how the LVS's decision rule can be 
implemented straightforwardly with a performance that delivers 
near-optimality under realistic threat models, with information- 
theoretic optimality approached as the malicious user moves 
further from the network region. 

Index Terms — Location Verification, Wireless Networks, Mu- 
tual Information, Likelihood Ratio Test, Decision Rule, Threat 
Model. 



I. Introduction 

Location verification in wireless networks is of growing 
importance. This is in part a consequence of the growing 
number of mobile services that utilize location information, 
but also in part due to the mission-critical nature the location 
information being supplied has on the performance, security 
and safety of some services. The importance of location 
verification is perhaps best illustrated in emerging Intelligent 
Transport Systems (ITS) where the verification of the location 
information supplied by vehicles is vital to the safety issues 
ITS hope to address (T). Indeed, recently there has been much 
effort in analyzing how a Location Verification System (LVS) 
in the context of ITS may operate |2)-||9]. Such ITS-based 
LVS work is also considered in other recent research efforts on 
location verification in more generic wireless network settings 

(e.g. m-my 

In an LVS one aims at verifying a user's claimed position 
based on some input measurements so as to perform a binary 
decision on whether the user is legitimate (claims his true posi- 
tion) or malicious (spoofs his claimed position). In general, an 
LVS aims at obtaining a low false positive rate for legitimate 
users and a high detection rate for malicious users, leading 
to a tradeoff perhaps best illustrated by a receiver operating 



characteristic (ROC) curve. However, it is established that 
ROCs are not always ideal in comparing performances of two 
separate systems (e.g. l23ll . 1241 ). It is also the case that the 
use of a ROC does not in any formal sense indicate what the 
optimal operating point of an LVS is. A possible direction to 
follow in attempting an optimization of an LVS is to utilize 
Bayesian hypothesis tests based on a Likelihood Ratio Test 
(LRT) - which minimizes the input/output classification error 
in the scenario where the cost of all types of misclassifications 
are equal [171 . Additionally, if the costs of misclassifications 
are not equal, then a variation of the LRT decision rule can 
be formed, namely the Bayes criterion (25). The Maximum A 
Posteriori criterion and the Maximum Likelihood criterion are 
special cases of the Bayes criterion. However, it is well known 
that these Bayes-decision criteria possess a weakness - they 
are subjective. This subjectivity arises through the necessity 
to pre-assign costs to the different types of misclassification. 
It has been discussed before how such subjectivity in Bayes 
criteria can give rise to confusion when comparisons of 
detector performances are made |23ll , 1241 . As such, although 
many of the previous works on LVSs have their own specific 
verification performance goals in mind, and their own pros 
and cons, none of these works identify an optimal LVS in any 
non-subjective sense. 

To make progress, what is actually required is an objective 
measure of detector performance, namely a single unified 
metric that takes into account all key aspects of intrusion 
detection in an objective fashion. As argued in 0241 . this 
menic should be the information-theoretic mutual information, 
and it is this approach we develop here in the context of 
location verification. More specifically, we develop here an 
information-theoretic framework for an LVS in which the 
mutual information between the input and output LVS data 
is used as the objective optimization criterion. 

In general an LVS can be characterized as follows. The 
input data (users to be verified) are represented by binary 
random variables X = x, x £ [0, 1], whose realized elements 
indicate legitimate (x = 0) or malicious (x = 1). Likewise 
the output data can be represented by binary random variables 
Y = y, y <E [0,1], whose realized elements indicate the binary 
decision made by the LVS, namely verified (y = 0) or not 
verified (y — 1). In the LVS, a decision rule is formed which 
indicates whether a user is malicious or not. This decision 
rule ultimately forms a test on whether some statistic (derived 
from network measurements and some prior information) is 
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less than or equal to some threshold. With these definitions 
in place, the contributions of this paper can be specifically 
summarized thus. Q 

1) We develop for the first time an information-theoretic 
framework for an LVS, which allows us to utilize the 
mutual information between X and Y as a unique 
criterion to evaluate and optimize the performance of 
an LVS. 

2) Under the assumption of known likelihood functions for 
the measurements, we prove that the likelihood ratio 
is the test statistic that produces the maximum mutual 
information between X and Y . 

3) Identifying the threshold value that maximizes the mu- 
tual information between X and Y, we then show 
how the Likelihood Ratio Test (LRT) is the decision 
rule which maximizes the mutual information between 
X and Y, and leads to the information-theoretic opti- 
mal LVS. We take the further step of determining the 
likelihood functions under a series of threat models. 
This leads to a working LVS that will be an optimal 
information-theoretic approach under the given threat 
models. 

4) We show from our analysis how an effectively optimal 
LVS, which is simple to deploy in practice, can be 
developed. We show that our LVS leads to an effectively 
optimal solution for most realistic attack scenarios in 
which a malicious user who is outside a network region, 
is attempting to spoof that he is within the network 
region. We further show how optimality is approached 
as the malicious user moves further from the network 
region. 

The remainder of the paper is structured as follows: Section 
Inl presents both the general network system model and our 
information-theoretic LVS framework. The decision rule that 
optimizes mutual information is constructed in Section Hn] 
In Section IIVI analytical and simulations of our LVS are 
presented for a series of threat models. A discussion on the 
optimality of our LVS is also given at this point. Section [V] 
concludes the paper. 

II. System Model and LVS Framework 

In this section, we first present the general location veri- 
fication system model and the related assumptions. Then, we 
develop an information-theoretic framework for an LVS, which 
allows us to utilize the mutual information between X and Y 
as an unique criterion to evaluate and optimize an LVS. 

A. General System Model 

The values of the input data X can be represented as two 
hypotheses. The first of these is the null hypothesis, Ho, which 
assumes the user to be verified is legitimate (x = 0). The 
second one is the alternative hypothesis, Hi, which assumes 
the user to be verified is malicious (x = 1). Likewise, the 
possible values of the output data Y can be represented as 

'Some preliminary results on this work based on sub-optimal (non-LRT) 
decision rules were presented at Globecom 2012 1261 . 



two decisions, where Vq denotes verified (y = 0), and T>\ 
denotes not verified (y = 1). We now outline the general LVS 
model, and detail the assumptions we use. 

1) A single user (legitimate or malicious) reports his 
claimed location, 9 C = (u c , v c ) E M 2 , to a network with 
K (K > 2) Base Stations (BSs) in the communication 
range of the user (the K BSs are not in a line), where 
Of s = (uf s ,vf s ) G R 2 is the location of the ith BS, 
i = 1,2, K. One of the K BSs is the Process Center 
(PC), and all other BSs will transmit the measurements 
collected from the user to the PC. The PC is to make 
decisions based on the user's claimed location and the 
measurements collected by all the K BSs. We assume 
all BSs are perfectly synchronized. 

2) We assume a user (legitimate or malicious) knows the 
locations of the K BSs, and that 9 C is supplied by the 
user to the PC. 

3) For the legitimate user, we assume the true location is 
given by 9 = 9 C (here we will ignore the small location 
determination error, e.g. the GPS erroJl). We assume 
the malicious user's true location 9 = (u, v) € M 2 is 
known exactly to him (i.e. again we ignore any small 
localization error), but is unknown to the network. 

4) We assume 9 is a bivariate random variable following 
some distribution. The prior distribution, i.e. the proba- 
bility density function (pdf), for 9 under Hi is denoted 
as p(e\Hi). 

5) In general, the measurement (Mi) collected by the ith 
BS from a legitimate user is dependent on 9f s and the 
legitimate user's 9 C . In practice, a malicious user can 
impact the measurements collected by all BSs in order to 
avoid detection. Thus, the measurement (M,) collected 
by the ith BS from a malicious user is some function 
of Of and the malicious user's 9 and his spoofed 9 C . 
Therefore, the measurement (Mi) collected by the ith 
BS can be given as a composite model as follows: 

f H o :M i = h o (6? s ,0 c ,uj) 

\ Hi:M i = h 1 (ef s ,e c ,e,uj), ( ' 

where ho and h\ are some functions yet to be specified 
(can involve additional parameters), and ui is random 
variable representing the communication channel noise. 
Given the statistical nature of u>, the composite system 
model in (Q~|) can produce the likelihood functions un- 
der Ho and Hi, which are denoted as p(m\Ho) and 
p(m\Hi), respectively, where m = [mi,TO2, ttik] 
is a realization of the measurement vector, M = 
[Mi, Ms, M K ]. 

6) We also assume a user is legitimate with a known prior 
probability, which is Po = P(x = 0). The probability of 
a user to be malicious is denoted as Pi, and Pi = 1 — Po- 

B. Information-Theoretic Framework for an LVS 

In general, the purpose of an LVS is to map the input data 
X to the output data Y, X — > Y, and can be represented as 

2 When this error is much smaller than the average distance between BSs 
the effect on the results is negligible. 
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Fig. 1. A Location Verification System (LVS) model. 



shown in Fig. Q] In Fig. Q] the false positive rate, a, and the 
detection rate, j3, are given as follows 

a = P(V 1 \'H ), l-a = P(V \H ), 
p = P(V x \Hi), l-/3 = P(V \Hi), 

where P is the probability of an outcome conditional on 
a hypotheses. The mutual information between X and Y 
can be expressed as I(X;Y) = H(X) - H(X\Y), where 
H(X) is the entropy of X, and H(X\Y) is the conditional 
entropy of X given Y. Given Pg, the entropy of the dis- 
crete binary random variable X can be written as H(X) = 
- Ex P(X) log 2 P(X) = -P log 2 P - (1 - P ) log 2 (l - 
Po). With these definitions, the conditional entropy 
can be expressed as in ||23l 

H(X\Y) = - J2 E P (*> y ) lo S2 P(X\Y) 



Po(l-«) - lo §: 



Po(l-a) 



P (l-a) + (l-P )(l-/3) 



P a -log 2 



P a 



P a+(l-P )/3 
^1 PVi (l-P )(l- < 8) 



P a + (l-P )/3 



(2) 



The information-theoretic optimal location verification al- 
gorithm can be defined as the one which maximizes I(X:Y) 
defined above. 

III. The Optimal Location Verification Algorithm 

In this section, based on the assumption of known likelihood 
functions under both Ho and Hi, we take the additional 
step of identifying the information-theoretic optimal location 
verification algorithm, which produces the maximum I(X; Y) 
relative to any other location verification algorithm. 

A. The Decision Rule for Maximizing I(X; Y) 

In the context of an LVS, a location verification algorithm 
must formulate a decision rule to infer whether the user is 
consistent with Ho or Hi- The algorithm ultimately forms a 



comparison of some test statistic, F(m), and a corresponding 
threshold, Tp, in the form of 

F(m) ^ Tp. 

For a given F(m), we will be interested in the value of Tp 
which maximizes I(X; Y), i.e., T F = argmax/(X; Y). Fur- 

thermore, we will be interested in determining the functional 
form of F(m) that maximizes I(X;Y). This leads to our 
main result, which is stated in Theorem 1. 
Theorem 1: Given the decision rule 



F(m) ^ I 

Po 



(3) 



the functional form of F(m) that maximizes the mutual 
information I(X; Y) is A (m), where 



A(m) 



a p{m\Hi) 



(4) 



p(m\H ) 

To prove Theorem [1] we first introduce two lemmas, of 
which the first is the Neyman-Pearson Lemma l27l . 

Lemma 1: Consider two hypotheses Ho and Hi, the deci- 
sion rule to maximize a detection rate (j3) for a given false 
positive rate (a) is 



Pi 

A(m) ^ T A , 



(5) 



where Ta is determined by the specified value of a. For proof, 
see l27l . Before proceeding, we note a < (3 will be a basic 
requirement for any useful LVS. 

Lemma 2: Given the assumption a < (3, the mutual in- 
formation I(X;Y) is a monotonic increasing function of the 
detection rate (3. 

Proof of Lemma 2: Since H(X) is not dependent on 
(3, the first derivative of I(X;Y) with respect to (3 can be 
expressed as 

dI{X-Y) d(H(X)-H(X\Y)) dH{X\Y) 



d/3 



dp d/3 

/3[P (l-a) + (l-P )(l-/?)] 



= (1 - P0) l 1O& (l-^)[Poa + (l- W] 
= (1 - P ) log 2 {/3[P (1 - a) + (1 - P )(l - 13)}} 

^ v / 

Vo 

- (1 - P )log 2 {[Poa + (1 - P )/3](l - (3)} . 

v ' 

Vi 

Note, since < Pq < 1, and the logarithm is a monotonic 
increasing function of z, then dI(X; Y)/df3 has the same sign 
of (Vo — Vi), where 

V -Vi= /?[P (1 - a) + (1 - P )(l - (3)] 
-[P a+(1-P )/3](1-J3) 
= Po(p-a). 

Thus, given the assumption a < (3, then dI(X;Y)/d(3 > 0, 
and Lemma |2] is proved. ■ 
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Given Lemma [T] and Lemma [2] we now prove Theorem Q] 

Proof of Theorem 1: If the specified value of a in 
Lemma Q] is that which results in the value T F of ([3j, then by 
Lemma |2] the result follows. ■ 

B. The Optimal Location Verification Algorithm 

Based on the above discussion, the optimal location verifi- 
cation algorithm can be summarized as follows: 

• Step 1: Determine the functional forms of ho and hi in 
©■ 

• Step 2: Specify the prior distributions for 9, p(9\Ho) 
and p(0\Hi), and determine the likelihood functions 
p(m\Ho) and p(m\Hi). 

• Step 3: With (0 as the general decision rule, derive the 
functional form of a and /3. Note, a and f3 will be 
function of Ta- 

• Step 4: Using I(X;Y) as the objective function, search 
for X^, which is the value of Ta that maximizes I(X; Y). 

• Step 5: Collect measurements and calculate the likeli- 
hood ratio A(m) according to the likelihood functions 
determined in step 2. 

• Step 6: Form the decision rule, 



A (to) 



T> i 
> 

< 
Do 



(6) 



IV. Specific Optimal Location Verification 
Algorithm with RSS as Measurements 

In order to implement the optimal location verification al- 
gorithm, in this section we take the further step of determining 
the likelihood functions under Ho and Hi with Received 
Signal Strength (RSS) as the system measurements, and we 
consider the algorithm under a series of threat models. 

Although the framework we develop can be built on any 
measurement (location information metric), such as RSS, 
TOA (time of arrival) and TDOA (time difference of arrival), 
for purposes of illustration we focus here only on an RSS 
implementation. In this case, the measurement Mj is the RSS 
(in dB) collected by the ith BS. We will also assume that the 
legitimate user and all BSs are equipped with only a single 
omni-direction antenna. 

Let us define the set of BSs that are within range of a 
legitimate user positioned at 9 C as the in-range BSs. This set 
of BSs forms an effective perimeter for the network used in 
the location verification. We will assume a single malicious 
user has the technology (e.g. directional beam-forming), which 
allows him to ensure (if required) that from some position 
outside the perimeter only the in-range BSs receive a non- 
zero RSS. The malicious user can set the power of the main 
directional beam. We do not allow an adversary to set multiple 
beams to different BSs via colluding malicious users (see later 
discussion). 

Based on the log-normal propagation model 1281 , h in (fTJ 
can be specified as 

'df 



ho{0? s ,O c ,uj) = Po-\0 1 \og 



10 



- w, 



(7) 



where P is a reference received power, d is the reference 
distance, 7 is the path loss exponent, to (in dB) is a zero-mean 
normal random variable with variance cr dB , and the Euclidean 
distance of the ith BS to the user's claimed location 9 C = 

(u c , v c ) is 



di = d(o c 7 en = V(« c - < s ? + (« c - «? s ) 2 - 

A malicious user can adjust his transmit power to impact the 
measurements collected by the BSs, thus h\ in ([U can be 
expressed as 



fci(0f s , 9 C , 9, u) = P +P x - 10 7 log 



id 



+ w, (8) 



where P x is a power level set by the malicious user, and d\ — 
d(9, 9f s ) is the Euclidean distance of the ith BS to the user's 
true location 9 = (u, v). 

Assuming all Mi's are independent from each other, the 
likelihood function p(m\Ho) can be expressed as 



K 



p(m\H ) = Y[p{mi\H ) 



i=l 
K 



TT^— 



exp 



(m* - n C iY 
2a 



(9) 



dB 



where 



^ = P -107log 10 ( ^ 



Also, the pdf of M conditional on 9 under Hi, p(m\9,Hi), 
can be written as 



K 



i{m\9,Hi) = Y[p(mi\e,Hi) 



K 

n 



1 



2-K(J dB 



cxp 



-(m l -Po-P c + 107log 10 (|) 



1°Ib 



(10) 



In general, a malicious user will utilize P x in an attempt 
to impact the measurements collected by the BSs in order 
to avoid detection. We now discuss how to determine the 
'optimal' value of P x from a malicious user's point of view. An 
LVS can be spoofed optimally if the measurements collected 
from a malicious user follow exactly p(m\Ho), which is given 
by ©. Therefore, in order to avoid detection a malicious user 
attempts to minimize the difference between p(m\Ho) and 
p(m\9,Hi). This difference can be quantified through the 
KL-divergence between the two likelihood functions, which 
is defined as follows 



D KL (p(m\Ho)\\p(m\9,Hi)) 



E 



p(m \Ho) In —, — . dm 
p(m\9,Hi) 

(tf-ri-P x ) 2 
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where 



M * = P -107lo gl 



do 



This KL-divergence is the information lost when p(m\6,'Hi) 
is used to approximate p(m\Ho), and it becomes zero if 
and only if the two distributions are identical. From an 
information-theoretic point of view the optimal value of P x 
can be expressed as 



P* = argminD^L (p(m\Ho\\p(m\6,'H 1 )) 

1 K (11) 

= ^£(K-/4). 

Setting P x = P* in ©, h\ can be rewritten as 

h 1 (ef s ,e c ,e,u J )=p +fi c -fi t -io 1 iog 1Q (U) 



where 



1 K 



and 



i=l 



1 K 



i=l 



Although is a known deterministic parameter for a malicious 
user, it is unknown for the network. This means h\ is still 
unknown, and therefore the likelihood function p(m\Hi) is 
unknown for the LVS. To make progress, we will assume 
some realistic threat models within which p(m\Hi) becomes 
known. 

We will first consider a simple threat mode in which the 
only thing the malicious user can do is make all the RSSs 
at the BSs appear equal to some constant. This is equivalent 
to putting the malicious user at an infinite distance from the 
BSs. From this simple model we will be able to analytically 
derive the false positive and detection rates, which will then 
allow a determination of the optimal threshold. However, the 
more realistic scenario involves the case where the malicious 
user can be anywhere, beyond some fixed distance from 
the network. We build some other threat models which can 
accommodate this more realistic scenario. We will then show 
how the optimal threshold derived for these other threat models 
is effectively the same as the optimal threshold for our simple 
threat model where all RSSs are constant. 



A. Infinitely- Far- Away Threat Model 

In this subsection we propose the deployment of our LVS 
within a specific threat model referred to as the Infinitely- 
Far-Away (IFA) model. To determine p(m\Hi) in this model, 
we assume a malicious user's true location is effectively 
infinitely far away from the BSs. That is, the distance of a 
malicious user's true location to every BS is approximated as a 
constant, i.e. d\ = d{6, 6f s ) = constant, Vi G [1,2,..., K]. 
Therefore, we will assume 

iE^^o(|)=107logio(|). d3) 



Substituting ([T3l into (fTZb . hi for the IFA model can be 
expressed as 



in (ef s ,e c ,w) =fi c + uj. 



(14) 



Then, p(rn\Hi) (which now does not depend on 6) can be 
written as 

p(m\Hi) = I] exp ( "^^l • d5) 

Based on (|4j, (0 and (15[ . we construct the decision rule 



A(m) 



exp V ^Tb 



exp 



> 
< 

V 



(16) 



In order to help determine a and /3 analytically, this decision 
rule can be rewritten as 



where 



and 



Pi 

F(m) ^ T, 



K 



l^ B lnT A -^((^) 2 -(^) 2 



(17a) 



(17b) 



(17c) 



Given (0 and ( TTTb , we have 



p (F(m)\H ) = N[J2^ &° - ) , E & - tif °$b) , 

\i=l i=l / 

where N(a, b) represents a normal distribution with a and b 
as the mean and variance, respectively. Likewise, given ( fT5] l 
and ( TT7i i, we have 

P (F(m)|Wi) = N ( M c (M c - l4) , E 0? - °dB ) ■ 

\i=l i=l / 

The false positive and detection rates under the IFA model can 
now be expressed analytically as 



a = P(A(m) > T A \H ) = P(F(m) > T\H ) 



Q 



(18) 



/3 = P(A(m) > T A \Hi) = P(F(m) > r\Ui) 



(19) 



where Q(x) = -4- f °° exp"* 2 / 2 dfc 

Having determined a and j3 under the IFA model, we can 
use these in (2) for the conditional entropy H(X\Y). The 
value of T which maximizes I(X;Y) = H(X) - H{X\Y), 
denoted as T* , can be determined numerically. Using ( 117c| ), 
the T A can be determined by T* . Then, the decision rule in 
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Fig. 2. Analytical and simulated a, /3 and £ with different values of T\, 
where K = 10, L = 1, and = 5. 



Fig. 3. Maximum Normalized Mutual Information (NMI) with different 
values of K and a^g, where L = 1. 



© which leads to the optimal verification algorithm under the 
IFA model can be formed, where A(m) is specified in ( TToT ). 

We verify the false positive and detection rates, given by 
dT8l and ( fl9l >. respectively, via detailed Monte Carlo simula- 
tions. The simulation settings are as follows: 

• The K BSs are randomly distributed in a 200m x 200m 
square field, 4 < K < 10. 

• The claimed locations of legitimate and malicious users 
are the same, which is C = [0,0]. 

• The legitimate users are at C . The malicious users are 
infinitely far away from C , which in practice means the 
measurements collected are generated according to (fT4l . 

• Each BS collects L measurements from each user. 

In the following, the simulation results are obtained through 
10,000 Monte Carlo realizations of the measurement vector 
M, and in all the specific results shown we have adopted the 
prior probability Pq = 0.9, and the path loss exponent 7 = 
3. Also note, we denote the Normalized Mutual Information 
(NMI) as 

i = I{X;Y)/H{X). (20) 

In Fig. |2] the analytical a and (3 are directly derived from 
(IT8b and ffT~9b . respectively, and the analytical £ is calculated 
using (g), (dD, ([T9l l and d20t . In order to obtain the simulated 
a, we randomly generate M according to (O, from which 
we get a specific realization of A(m), and for each value of 
Ta we decide whether the user is legitimate or malicious by 
([T6| >. To obtain the simulated f3, we randomly generate M 
according to (fT~4T >. and follow the same procedure as above 
for a. The simulated £ are calculated using (ffj and d20b with 
the simulated a and j3 as the input. In Fig. [2] we have set 
K = 10, L = 1, and g^b = 5. From Fig. [2] we can see that the 
comparison between simulation and analysis shows excellent 
agreement, which verifies the analysis we have provided for 
the IFA threat model. We have investigated a range of other 
values of K, L and OdB- Some of these results are shown in 
Fig. |3] where the maximum NMI is shown as a function of 



&dB for different values of K. From Fig. [3] we see again the 
simulations agree with the analytical results. 

B. Circle Threat Model 

In this subsection we propose the Circle threat model, where 
the malicious users are assumed to be uniformly distributed on 
a circle. More specifically, the malicious user's true location 
is uniformly distributed on a circle, whose radius and center 
are R and C , respectively, where R > r, and r = max(d?). 

The main purpose of this model is to commence our probe 
of how reliable the use of the IFA will be when its assumptions 
are violated. To this end, we note that if the maximum 
difference between any two measurements collected from a 
malicious user, Mi and Mj (i 7^ j), is no larger than <JdB, 
the scale R at which this occurs provides a natural distance 
at which we could anticipate the IFA and the Circle threat 
models to be approximately equivalent. To quantify this let 
us introduce p = R/r. Under the Circle threat model, the 
difference between AL and Ma can be written as 



\Mt-Mj\ = 



10710^ 



10 



— IO7 log 



10 



Given that for a malicious user, we have max(|<i* - 
and min(d*) = R — r, we can write OTI) as 



(21) 

= 2r, 



\M,-MA < 



IO7 
In 10 



In 



d* + 2r 



< 



IO7 
In 10 



In 1 



2r 
R-r 



where without loss of generality we have assumed dh > d\. 



order to guarantee the required constraint max | Mi 
UdB, we should have 



Mi 



In 

< 



IO7 
In 10 
which results in 

P > P 



In 1 



2r 



R 



< &dB, 



exp 



{ <J dB lnio l 
1 107 / 



(22) 
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p = 0.1 x p 



p = 0.2 xp 





Fig. 4. Normalized Mutual Information (NMI) with different values of p, Fig. 5. Numerical and IFA approximated Normalized Mutual Information 
where K = 10, L = 1, cr^B = 5, and p* = 5.275. The solid curves (NMI) with different values of i?2 , p* = 5.275, p = 0.2 X p*, and cr ds =5. 
represent the NMI achieved under the correct decision rule (5)- The dashed The solid curves represent the NMI achieved under the correct decision rule 



curves represent the NMI achieved under the IFA decision rule )16t . 



{5J. The dashed curves represent the NMI achieved under the IFA decision 
rule 01). 



where p* is a reference value that will be utilized when 
comparison with the IFA model is made. Such a comparison 
is achieved by using the IFA decision rule in ( fToT l but under 
the Circle threat model. In such a set up we would anticipate 
that the optimal thresholds of the IFA and the Circle model 
would be very similar at p* . 

To proceed with a comparison of the IFA and the Circle 
threat models we conduct Monte Carlo simulations. In these 
simulations, note that although p(m\Ho) as given by (0 is 
used, the likelihood p{m\H{) given 

p(m\'H 1 ) = j ' p(m\e,Hi)p(0\Hi)dO. (23) 

must be determined numerically. The other simulation settings 
are the same as in the IFA model, except the malicious users 
are uniformly distributed on a circle, whose radius and center 
are R and 9 C , respectively. The measurements collected from 
the legitimate and malicious users are generated according 
to and (fT2l . respectively. To obtain the true numerical 
NMI under the Circle model, we use (0 and d23l to calculate 
p(m\Ho) and p(m\Hi), respectively, and utilize (0 as the 
decision rule. To simulate the NMI obtained from the use of 
the IFA decision rule (but under the Circle threat model) we 
use (0 and dl5l l in order to implement the decision rule in 
(US). 

From our results, shown in Fig. we can see that at 
values of p << p* the optimal thresholds (the values of 
T\ which maximize the NMI) for the two cases are very 
different. However as p — > p* the optimal threshold obtained 
under blindly adopting the IFA decision rule (even though 
the malicious user is not at infinity) is effectively the optimal 
value. Note also, the maximum values of the two NMIs and the 
corresponding Tj£ are coincident when p = p* , which verifies 
that the reference value of p in ( 1221 is reasonable. 



C. Annulus Threat Model 

In this subsection we implement the optimal location verifi- 
cation algorithm under a more realistic threat model, referred 
to as the Annulus threat model. In this model p{9\Hi) is 
assumed to be uniform over the annulus formed by two con- 
centric circles, whose finite radii are R\ and i?2 (Ri < R2), 
respectively, and whose mutual center is 6 C . This implies 
p(6\Hi) = l/n(R% - Rf), where 6 e {6\R\ <{u- u c ) 2 + 
(v — v c ) 2 < R%}. Under this model, p(rn\Ho) is also the 
same as in (0, and p{m\%x) is as given in d23l but with the 
modified prior distribution. Again, no closed form solution is 
available for d23l . 

1) Monte Carlo: In these new Monte Carlo simulations 
the settings are again the same as in the IFA model, except 
that now the malicious users are uniformly distributed in 
the annulus. Again, the measurements collected from the 
legitimate and malicious users are generated according to (0 
and ( fT2l . respectively. To obtain the true numerical NMI under 
the Annulus model, we use (O and ( 1231 to calculate p(m\Ho) 
and p(m\Hi), respectively, and utilize (0 as the decision rule. 
To simulate the NMI obtained from the use of the IFA decision 
rule (but under the Annulus threat model) we use (0 and ( fl3b 
in order to implement the decision rule in (116b . The results of 
our simulations are shown in Fig. where p = 0.2/3*, and p 
is redefined as p = R\jr. 

In top left plot of Fig. we have set R2 = Ri, so in 
this specific plot the Annulus threat model is equivalent to 
the Circle threat model (the result is the same as that shown 
in the top right plot of Fig. 0. However, again we see that 
as i?2 increases the optimal threshold obtained under blindly 
adopting the IFA decision rule (even though the malicious 
users are constrained within an annulus) is effectively the 
optimal value. Note also, that in the Annulus threat model, 
as i?2 increases this results holds for cases when p << p* 



(which was not the case in the Circle threat model). 

2) Analytical Laplace Approximation: As a final point, 
we note that instead of numerically solving d23l l, it may be 
useful to find an approximate closed-form solution to ((23} 
{e.g. this would allow for an approximate closed-form for the 
false positive and detection rates under this threat model). We 
can approximate p(m\Hi) via an application of the Laplace 
approximation, which can approximate integrals through a 
series expansion by using local information about the integrand 
around its maximum ||30l 1311 . The details are as follows. First, 
let us define a quantity as: 

M0|^i) = MMH0>^iM0|^i))- 

h(6\'Hi) can be expanded using a Taylor series around its 
maximum a-posteriori (MAP) estimate, denoted by © = 
arg max {p(m\0, T-Li)p(6\Hi)} . This is the point where the 



posterior density is maximized, i.e., the mode of the posterior 
distribution. Hence, we obtain to second order, 



h(0\Hi 



h [B\Ui 







t dh(@\Hi 



80 t 

(=0) at MAP location 

(24) 



2("- ) — k- 
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The second term in d24"l i is 0, because the first derivative is 
zero at the maximum of h (0\Hi). Replacing h (0\Hi) by the 
truncated second-order Taylor series yields: 

h (e\Hi) « h (&\Hi) + x - (e - &Y h (e - 0) , 

where H is the Hessian of the In posterior, evaluated at 0: 



H 



d 2 h l&lHi 



d 2 



d 2 h{9\U 1 ) 



d0d9 H 



Using the above approximation, we have the following 
lnp(m|Hi) =ln J p (m\0 ,U X ) p (0|Wi) dd 
= ln J cxp^^dO 
«ln/"exp fe (®l^)+K e -®) rH ( -®) dO 
=h(&\Hi) +ln|exp5(^ S ) T H(e-®)^ 



| + iln|27rH| 



= lnp[@\Hi \ + \np[m\&,n 1 

+ ln|2 7 rH- 1 | 1/2 . 
Finally, the marginal likelihood estimate can be written as 

r-l|V2 



p(m\Hi) =p [®\Hi )p (m\G,Hi) ^ttH" 1 



(25) 



In ((25]l, p[®\Hi) and |27rH _1 1 1/2 are both constant for a 




Numerical with K = 10 
Laplace-App with K = 1 
Numerical with K = 7 
Laplace-App with K = 7 
Numerical with K = 4 
Laplace-App with K = 4 
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False Positive Rate (a) 
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Fig. 6. Numerical and Laplace approximated ROC curves with different 
values of K, where = 5, L — 1, ill = 100m, and R2 = 500m. 



p(m\Hi), is a K dimensional normal distribution with the 
same variance as p(m\Ho) (because the variances of p(m\Ho) 
and p(m\6, Hi) are the same). Under the Laplace approxima- 
tion the decision rule in (O is approximated by 



A(m) 



p(m\Hi) > 
p(m\H ) 



< 



T A . 



(26) 



specific 0; thus the Laplace approximated likelihood function, 



To study the performance our Laplace approximation we 
calculate ROC curves for both the numerical Monte Carlo 
calculation of p(m\Hi) and the Laplace approximation of 
p(m\Hi). These different forms are then used in the same 
decision rule (0 in order to form the ROC curves. The results 
of these simulations are shown in Fig. |6] and as we can see 
the approximation is a good one for the parameters used. We 
have further investigated the accuracy of the approximation 
over a range of other parameters, finding similar results to 
those shown in Fig. |6] 

D. Discussion 

In the preceding sections we have looked at a general attack 
scenario under specific threat models. The attack scenario we 
have focussed on is that of a non-colluding adversary who is 
attempting to spoof he is within the perimeter of some wireless 
network, when in reality he is some distance beyond the 
network boundary. A non-colluding adversary who is within 
the network region will in general be easily identified due 
to his inability to set different received signal strengths at 
different BSs. An attack from outside the network region is 
the perhaps most realistic and likely scenario one can imagine 
for the emerging ITS scenario. For example, a single adversary 
who is some distance from the highway (so as not to be easily 
identified or caught) is attempting to disrupt proper functioning 
of the ITS on the highway. 

What we have shown through our investigations of specific 
threat models under our general attack scenario, is that an 
optimal LVS can be developed for each threat model, but 
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in a non-straightforward manner in most cases - i.e. no 
closed-form solutions for the detection and false positive rates 
are available. Without such closed-form solutions for these 
rates, one must resort to complex and time consuming Monte 
Carlo simulations in order to determine the optimal threshold. 
However, from considerations of the IFA model, and how 
other threat models can be approximated by the IFA model, 
we have shown how a straightforward LVS algorithm can be 
deployed which is effectively optimal for most circumstances. 
More specifically, using analytical solutions for the detection 
and false positive rates in the IFA setting, which are then 
used in easily determining the optimal threshold value, a 
straightforward LVS is developed whose performance is near- 
optimal when the adversary is close to the network, and 
optimal as the adversary moves to a large distance from 
network. 

However, of course more sophisticated attacks than those 
highlighted above are possible. The most obvious of these 
is that of colluding adversaries who can communicate and 
cooperate with each other so as to form collective attacks on 
the LVS. An example of such an attack would be colluding 
adversaries who set different received signal strengths at differ- 
ent BSs. On the defensive side, the network could also deploy 
beam-forming techniques to help the LVS thwart these types 
of attacks. The LVS could also deploy tracking algorithms and 
physical layer security techniques to assist in its defense. These 
more sophisticated forms of attack and their corresponding 
defensive strategies are out of scope of the current work, but do 
form part of our ongoing research efforts in this area. However, 
we should be clear that ultimately any defensive strategy for 
an LVS is ultimately doomed if the colluding adversary is 
afforded unlimited resources and the communications network 
is purely classical in nature@ 

V. Conclusions 

In this paper, we developed an information-theoretic frame- 
work for an LVS, utilizing as the objective optimization 
criterion the mutual information between input and output 
data of the LVS. We investigated our new optimal LVS under 
a series of threat models, showing how in a straightforward 
implementation of an LVS, information-theoretic optimality 
is approached as the non-colluding adversary moves further 
from the network region it is claiming to be within. This 
straightforward implementation of our new algorithm is an 
ideal candidate for the LVS that will be needed in emerg- 
ing network-based and safety-enhanced transportation systems 
such as ITS. 
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3 Note that location verification in the cont ext o f qua n tum communications 
systems has previously been considered e.g. 1321 . 1331 , 1341 . and it has been 
argued that such systems are able to securely verify a location under all 
known threat models 1351 - although see 1361 who argue otherwise. It is 
undisputed that classical communications alone cannot achieve secure location 
verification under all known threat models. 
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